ABSTRACT

Using a single-RNA detection technique in live
Escherichia coli cells, we measure, for each cell,
the waiting time for the production of the first RNA
under the control of the PBAD promoter after induction by
arabinose, and the subsequent intervals between
transcription events. We find that the kinetics of the
arabinose intake system affect the mean and diversity in
RNA numbers, long after induction. We observed
the same effect on the Plac/ara-1 promoter, which is
inducible by arabinose or by IPTG. Importantly, the
distribution of waiting times of Plac/ara-1 is
indistinguishable from that of PBAD if and only if it is
induced by arabinose alone. Finally, RNA production under
the control of PBAD is found to be a sub-Poissonian
process. We conclude that inducer-dependent
waiting times affect mean and cell-to-cell diversity
in RNA numbers long after induction, suggesting
that intake mechanisms have non-negligible effects
on the phenotypic diversity of cell populations in
natural, fluctuating environments.

INTRODUCTION 

Transcription in E. coli is, at a genome-wide scale, a
relatively rare stochastic event (1-3). Further, many genes
only become active in response to external stimuli (4-7),
via processes that are also stochastic (7). Although much
is known about the noise in gene expression at the single-cell
level (1-3,7-10), most of our present knowledge of the
kinetics of the response of gene activity to external
signals concerns the average behaviour of cell
populations alone (11). However, to characterize the
dynamics and the underlying steps of intake processes, it
is necessary to observe their effects in individual live cells
(12). Such observations should also inform on the
robustness of cellular response mechanisms, by revealing
the degree of change in the responses of a single cell to
multiple occurrences of the same stimulus, as well as the
difference in responses to different stimuli.

One of the best-known gene activation mechanisms
is the arabinose utilization system of E. coli.
This system imports arabinose into the cell by AraFGH,
an arabinose-specific high-affinity ABC transporter
(11,13-15), and by a low-affinity transporter, AraE,
which binds to the inner membrane and makes use of
the electrochemical potential to import arabinose
(11,16,17). This system exhibits wide variability in the
timing of activation and in the rates of accumulation of
inducer molecules (18). It has been hypothesized that this
is due to the cell-to-cell variability in the numbers of
proteins responsible for the intake of arabinose (18).
Interestingly, if the intake gene araE is placed under the
control of a constitutive promoter, the intake rates become
more homogeneous (19-21), suggesting that the diversity in
the number of intake proteins is a non-negligible source of
cell-to-cell variability in the kinetics of the arabinose
utilization system (12).

Evidence suggests that when the intracellular concentration
of arabinose exceeds a threshold, the dimeric AraC
protein activates the genes that code for the proteins
responsible for the intake (AraE and AraFGH) and for the
catabolism of arabinose (araBAD) (11,22). In the absence
of arabinose, AraC binds two half-sites on the DNA
(I1 and O2) and promotes the formation of a DNA loop
that prevents access of RNA polymerases to the
promoters in that region (PC and PBAD). When bound by
arabinose, AraC binds instead to the adjacent I1 and I2
half-sites. The resulting configuration promotes
transcription initiation at PBAD (11).

Transcription initiation is a complex, multi-stepped 
process (23,24). In vitro measurements suggest that this 
process has at least two to three rate limiting steps 
(25,26). It starts when the RNA polymerase binds to the 



*To whom correspondence should be addressed. Tel: +358408490736; Fax: +358331154989; Email: andre.ribeiro@tut.fi 
© The Author(s) 2013. Published by Oxford University Press. 

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which
permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. 



Nucleic Acids Research, 2013, Vol. 41, No. 13 6545 



promoter region of the DNA molecule, forming the closed
complex, which is followed by the open complex formation
and promoter escape (27,28). The RNA polymerase
then elongates the nascent RNA (28). Evidence suggests
that, in general, initiation is much longer in duration than
elongation (26,29). Recent in vivo measurements of the
kinetics of initiation of the Plac/ara-1 and PtetA promoters
have shown that RNA production under the control of
these promoters is a sub-Poissonian process (8-10). These
studies also support the existence of multiple steps at the
stage of initiation, significantly limiting the rate of RNA
production, as suggested by in vitro measurements (30).

Here, we investigate the degree of contribution of the
process of intake of arabinose and of the process of
transcription under the control of PBAD to the cell-to-cell
diversity in RNA production. Namely, we report
measurements of the in vivo kinetics of induction and transcript
production of PBAD with single-molecule sensitivity,
making use of the MS2d-GFP tagging of RNA in E. coli
(31). For that, in each cell, we measure the waiting time
until the first RNA is produced after induction and the
subsequent intervals between consecutive transcript
productions. For comparison, we conduct the same
measurements for Plac/ara-1 when induced by either of its two
inducers, arabinose and IPTG.



MATERIALS AND METHODS 

Strains and plasmids 

Escherichia coli strain DH5α-PRO was generously
provided by I. Golding, University of Illinois and
contains the construct PROTET-K133, carrying
PLtetO-1-MS2d-GFP (31), along with a new construct, pMK-BAC
(PBAD-mRFP1-MS2-96bs), which is a single-copy F-based
vector carrying a sequence coding for a monomeric red
fluorescent protein (mRFP1) followed by a 96 binding
site array under the control of PBAD (cloning information
provided in Supplementary Methods) (see also
Supplementary Figures S1 and S2). The strain with
plasmids PLtetO-1-MS2d-GFP and pIG-BAC
(Plac/ara-1-mRFP1-MS2-96bs) (32) was used as well. The
DH5α-PRO strain [identical to Z1 (31)] is a genuine producer
of AraC (33). No modifications were made to the
chromosome of this strain in our experiments.

Media and growth conditions 

Cells were grown overnight at 30°C with aeration and
shaking in Luria-Bertani (LB) medium, supplemented
with antibiotics according to the plasmids. The cells
were diluted in fresh M63 medium and allowed to grow
until an optical density of OD600 ≈ 0.3-0.5. To attain full
induction of the MS2d-GFP reporter, cells were
pre-incubated for 40 min with 100 ng/ml anhydrotetracycline
(aTc, IBA GmbH). The same protocol was used for each
strain.

Microscopy 

For microscopy measurements, cells were pelleted and
resuspended in ~50 µl of fresh M63 medium. Afterwards,
a few microlitres of cells were placed between a 3%
agarose gel pad made with medium and a glass coverslip
before assembling the imaging chamber (FCS2,
Bioptechs). Before the start of the experiment, the
chamber was heated to 37°C.

Cells were visualized in a Nikon Eclipse (TE2000-U,
Nikon, Japan) inverted microscope with a C1 confocal
laser-scanning system using a 100x Apo TIRF objective.
A flow of fresh, pre-warmed M63 medium containing the
inducer was provided with a peristaltic pump at a rate of
1 ml/min. Images were taken once per minute for 2 h, and
the laser shutter was open only during the exposure time
to minimize photobleaching. The peristaltic pump was
started at the same time as the collection of the time
series. For image acquisition, we used Nikon EZ-C1
software. GFP fluorescence was measured using a
488 nm argon ion laser (Melles-Griot), a 515/30 nm
emission filter and a pixel dwell time of 3.36 µs (total
image acquisition time of 3.5 s).

An interacting multiple model filter-based autofocus
strategy (34) was used to correct focus drift in time
series acquisitions. The method uses an interacting
multiple model filter algorithm to predict the focal drift
at time t based on the measurement at t-1. This reduces
the number of images required at different z-planes for
drift correction, thus minimizing photobleaching.

Data and image analysis 

Data and images were analysed using custom software 
written in MATLAB 2011b (MathWorks). Cells were
detected from fluorescence images by a semi-automatic 
method described previously (8). In time series, the area 
occupied by each cell was manually masked. Principal 
component analysis was used to obtain the dimensions 
and orientation of the cells within each mask. 
Fluorescent spots in the cells were automatically seg- 
mented using density estimation with a Gaussian kernel 
(35) and Otsu's thresholding (36). Finally, background- 
corrected spot intensities were calculated and summed to 
produce the total spot intensity in each cell. 

Moments of appearance of novel target RNA molecules
in each cell were obtained from time-lapse
fluorescence images by fitting the corrected total spot
intensity over time in each cell to a monotone
piecewise-constant function by least squares (37). The
number of terms was selected using the F-test with a
P-value of 0.01. Each jump corresponds to the production
of a single RNA molecule (37). An example of this
procedure is shown in Figure 1D. For more details on
the image analysis see (8). Note that, in cells that do not
contain target RNA molecules at the start of the
measurements, the number of novel RNA molecules detected
from the start of the measurements until a given moment
equals the total number of RNA molecules in the cell at
that moment.

Because some cells already contained target RNA molecules
at the start of the measurement, the total RNA
numbers within cells at a given moment in time are
obtained using a different method. Specifically, when
Figure 1. Measurement system. (A) Components of the detection system. The expression of the tagging protein, MS2d-GFP, is controlled by PLtetO-1
(33) and is inducible by anhydrotetracycline (aTc). The target RNA contains an mRFP1 coding region, followed by an array of 96 MS2d-binding
sites. Expression of the target RNA is controlled by PBAD, whose activity is regulated by AraC and the inducer L-arabinose. The target construct is on
a single-copy F-plasmid. The tagging construct is on a medium-copy vector. (B) Figurative description of the waiting time for the first RNA
production (t0) and intervals between subsequent productions (Δt). Images are taken once per minute for 2 h. (C) Example of E. coli cells expressing
MS2d-GFP and target RNA. GFP-tagged RNA molecules are marked by circles. (D) Time course of total intensity of spots in a cell (circles) and
monotone piecewise-constant fit (line).



comparing measurements using MS2d-GFP tagging and
using a plate reader (Supplementary Figure S4), the total
number of MS2d-GFP-tagged RNA molecules was
extracted from the total spot intensity distribution,
obtained from all cells in an image taken at a given
moment after induction. For this, the first peak of the
obtained distribution is set to correspond to the intensity
of a single RNA molecule. The number of tagged RNAs
in each spot can be estimated by dividing its intensity by
that of the first peak (32).
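
The intensity-to-count conversion described above can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code: the function name, the rounding to the nearest integer and all intensity values are assumptions made for the example.

```python
def rna_counts(spot_intensities, single_rna_intensity):
    """Estimate tagged RNAs per spot as intensity / single-RNA intensity.

    Rounding to the nearest integer is an assumption of this sketch;
    the text only states that the intensity is divided by the first peak.
    """
    if single_rna_intensity <= 0:
        raise ValueError("single-RNA intensity must be positive")
    return [round(i / single_rna_intensity) for i in spot_intensities]


# Hypothetical background-corrected spot intensities (arbitrary units).
spots = [95.0, 210.0, 102.0, 310.0]
# Hypothetical position of the first peak of the intensity distribution.
first_peak = 100.0

counts = rna_counts(spots, first_peak)
total_rna = sum(counts)
```

In practice the first peak would be located from the intensity distribution pooled over all cells in an image, which this sketch takes as given.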

RESULTS 

Experimental design 

To study the kinetics of expression of PBAD, we detect
individual RNA molecules as they are produced in live
cells and register when these events occur. For this, we
placed the PBAD promoter on a single-copy F-plasmid,
followed by a coding region for mRFP1 and an array of
96 binding sites for MS2d-GFP-tagging proteins (32)
(Figure 1A). The expression of MS2d-GFP is controlled
by PLtetO-1, which is activated before the gene of interest
so that sufficient MS2d-GFP proteins are present
when target RNA molecules appear. Induction of PBAD
and image acquisition are initiated simultaneously
(Figure 1B). For this, we use a temperature-controlled
imaging chamber and a peristaltic pump for introducing
inducers and fresh media. From the fluorescence images,
using semi-automated cell segmentation and tracking
(Figure 1C) (8), we measure in each cell the time for the
first RNA to appear (named 'waiting time', t0), as well as
the subsequent intervals between consecutive RNA
productions, Δt, until cell division occurs or until the end of
the measurement period (Figure 1D).

Given that values of t0 can only be obtained from cells
of the first generation (i.e. cells already on the slide when
the measurement begins), and as cells that do not divide in
the first 2 h will not, in general, divide afterwards, we
limited the measurement period to 2 h for simplicity.
This was possible, as this period also proved to be
sufficient to acquire enough samples of Δt.

From cells born during the measurement period, we
only extract intervals between consecutive RNA
productions, not waiting times, as these cells contain inducers
by inheritance. We detected no difference in the distributions
of intervals obtained from such cells and from cells already
present when induction is initiated. Finally, we observed
~0.2 RNA molecules per cell at the moment preceding
induction, because of spurious transcription events. Cells
where a target RNA was already present at the start of the
measurement were also not used to obtain values of t0.

First, we compared by quantitative polymerase chain
reaction the RNA production from the F-plasmid and
from the native gene under the control of PBAD
(Supplementary Methods). Using 16S rRNA as the reference
gene, we observe a similar trend in activity over time in the
native promoter and in the one on the F-plasmid
(Supplementary Figure S3).

We next compare expression levels of the target gene, 
when assessed by independent methods, for two induction 
levels, namely, 0.1 and 1% L-arabinose (Supplementary 
Methods). In Supplementary Figure S4A and B, we 
show the temporal variation after induction in mean 
numbers of MS2d-GFP-tagged RNA molecules in cell 
populations and in the fluorescence intensity of RFP 
measured by plate reader, respectively. 

The plate reader measurements of mRFP1 levels, 2 h
after induction in liquid culture, show a fold change of
1.67 when L-arabinose is increased from 0.1 to
1%. The MS2d-GFP in vivo detection method shows a
fold change of 1.74 between these same conditions,
showing that the results from the two methods are in
accordance. From this and the previous experiment, we also
conclude that the MS2d-GFP tagging method accurately
detects RNA production of the target gene, and that the
target gene behaves similarly to the natural system.

We also assessed the range of inducer concentrations
for which the target gene is under full induction. We measured
with the plate reader its expression for varying inducer
concentrations, 2 h after induction. From Supplementary
Figure S5, maximum induction is achieved for 1% arabinose.
From here onwards, unless stated otherwise, we use this
concentration to assess the kinetics of RNA production
under the control of PBAD.

First RNA and intervals between consecutive RNA 
molecules in individual cells 

From the time-lapse images acquired with confocal
microscopy, after induction, we measure in each cell both
t0 and subsequent values of Δt. t0 is expected to include
the time for arabinose to enter the cell via the intake
mechanism, the time to find the promoter and release the
repressor and also the time for the recruitment of the RNA
polymerase and subsequent production of the first target
RNA. The latter process includes events such as the closed
and the open complex formation at the promoter region,
as well as the elongation time. Both the elongation time
and the time for MS2d-GFP to bind to a target RNA are
expected to be negligible in comparison with the duration
of the intake and of transcription initiation (8,12,31).
Meanwhile, Δt should depend only on the events in
transcription initiation (37).

The distribution of values of the waiting times, t0, is
shown in Figure 2A. Cells were induced in the gel with
fresh media and 1% arabinose. The distribution is broad,
with waiting times spread throughout the measurement
period, and has a mean of 3071 s.

The distribution of intervals between consecutive
productions of target RNA molecules (Δt) is shown in
Figure 2B. This production is a sub-Poissonian process,
as the normalized variance (σ²/μ²) of the distribution is
0.37. Similar conclusions were obtained from
measurements of the in vivo kinetics of RNA production under
the control of Plac/ara-1 and PtetA (9,10).
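
As a quick check of the arithmetic, the normalized variance can be recomputed from the mean and standard deviation reported for the interval distribution (1672 s and 1012 s). For exponentially distributed intervals, i.e. a Poisson process, this quantity equals 1, so values below 1 indicate sub-Poissonian production. The sketch below also shows the calculation on raw intervals; the function name and the toy intervals are assumptions for illustration only.

```python
import statistics


def normalized_variance(intervals):
    """Return variance / mean**2 of a sample of intervals."""
    mu = statistics.fmean(intervals)
    return statistics.pvariance(intervals, mu) / mu ** 2


# From the reported summary statistics of the interval distribution.
cv2_reported = (1012 / 1672) ** 2

# Hypothetical intervals (s), for illustration only.
toy_intervals = [900, 1500, 1700, 2100, 2400]
cv2_toy = normalized_variance(toy_intervals)

# A Poisson process has normalized variance 1.
is_sub_poissonian = cv2_reported < 1.0
```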



The distributions in Figure 2A and B differ significantly.
We verified this with a statistical test of the
equality of two empirical distributions, the
Kolmogorov-Smirnov (K-S) test. We obtained a
P-value of 2.8 x 10^-18, much smaller than 0.05, which
allows rejecting the null hypothesis of similarity. We
conclude that in the case of PBAD and the arabinose
intake mechanism, the time of intake of inducers affects
significantly both mean and standard deviation of RNA
numbers in individual cells, long after induction. Finally,
note that the difference between the distributions of t0 and
Δt provides evidence that the activity of PBAD changes
significantly with induction. Otherwise, these two
distributions should not differ significantly, as they would both
result, e.g. from spurious transcription events alone.
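
For readers unfamiliar with the two-sample K-S test, the sketch below computes the K-S statistic (the largest distance between two empirical cumulative distributions) and a large-sample approximation of its P-value. It is a dependency-free illustration and not the authors' implementation: the function name, the small-sample correction factor and the synthetic data are assumptions, and the approximation should not be trusted for very small samples.

```python
import math


def ks_two_sample(x, y):
    """Two-sample K-S statistic D and an asymptotic P-value."""
    x, y = sorted(x), sorted(y)
    n, m = len(x), len(y)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(x[i], y[j])
        # Advance both empirical CDFs past the value v.
        while i < n and x[i] <= v:
            i += 1
        while j < m and y[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.3:
        # The series below converges slowly here; the true value is ~1.
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


# Synthetic samples (s), for illustration only.
sample_a = [float(v) for v in range(50)]
sample_b = [v + 100.0 for v in sample_a]   # fully separated from sample_a

d_same, p_same = ks_two_sample(sample_a, sample_a)
d_diff, p_diff = ks_two_sample(sample_a, sample_b)
```

Identical samples give D = 0 and no evidence against the null hypothesis, whereas fully separated samples give D = 1 and a vanishingly small P-value.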

One recent study (12) also focuses on the in vivo
induction kinetics of PBAD. This study uses measurements of
GFP levels in cell populations, whose expression is
controlled by PBAD (inserted into a medium-copy vector)
and a model to extrapolate the mean activation time of the
promoter, after induction. Assuming a threshold for GFP
levels to consider the promoter as active, the mean
appearance time of GFP after induction was ~960 s. By
considering several features of the measurement system, including
the mean maturation time of GFP, a value was then
extrapolated for the expected activation time of the
promoter, namely, ~250 s. This does not include the
time for transcription to be completed, once the closed
complex is formed. This study thus predicts a faster
mean initiation time than what our direct measurements
indicate (~3000 s). Two main reasons exist for this
difference. First, in the mutant used previously (12), the
chromosomal araBAD operon is deleted, avoiding
the negative feedback mechanism, which likely shortens
the response time significantly. Additionally, gene
expression was assessed from a medium-copy vector, which
should respond much faster than the single-copy vector
system used here, as its response time depends on the
fastest of the response times of several promoter copies.
Thus, we find that the results reported previously (12) and
ours are in agreement. For example, while observing mean
waiting times one order of magnitude longer, we do
observe RNA molecules appearing in some of the cells
within a time scale of 200-400 s after induction.
Therefore, had a multi-copy vector been used
instead of the single-copy vector used here, we would expect
mean waiting times one order of magnitude smaller
and thus in agreement with the measurements described
previously (12).

Correlations between consecutive processes 

To study whether the durations of the processes of intake
and of transcription initiation are correlated, we first
assessed whether consecutive intervals Δt in individual
cells are correlated. We measured the Pearson correlation
from 101 pairs of consecutive intervals, and found it to be
0.16. We obtained a P-value of 0.11, assuming no
correlation as the null hypothesis, which implies that we cannot
establish that the correlation is significant. This is in
agreement with previous studies of Plac/ara-1 kinetics, which also
Figure 2. Kinetics of the intake and production. (A) Probability density distribution of waiting times (μ = 3071 s, σ = 1711 s) for the first RNA to be
produced in cells induced by 1% L-arabinose (354 data points). (B) Probability density distribution of intervals between transcription events for PBAD
when induced by 1% L-arabinose (μ = 1672 s, σ = 1012 s) (347 data points).



indicate the absence of correlation between durations of
consecutive intervals between RNA productions (8).
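
The reported P-value can be roughly reproduced from the correlation coefficient and the number of pairs alone. The sketch below converts r into a t statistic with n - 2 degrees of freedom and, as a simplifying assumption valid for large n, evaluates the two-sided P-value with a normal approximation instead of the exact t distribution. The function names and the toy data are assumptions; the test actually used by the authors is not specified in the text.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided P-value for H0: no correlation (normal approximation)."""
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return t, math.erfc(abs(t) / math.sqrt(2.0))


# Values reported for consecutive intervals: r = 0.16 from 101 pairs.
t_stat, p_approx = approx_p_value(0.16, 101)

# Toy check of the correlation function on perfectly linear data.
r_toy = pearson([1.0, 2.0, 4.0], [2.0, 3.0, 5.0])
```

With r = 0.16 and 101 pairs, the approximation gives a P-value close to the 0.11 quoted above.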

We next assessed whether the values of t0 and of Δt
(Figure 2A and B) are correlated. Note that
t0 and Δt are of a similar order of magnitude as the
measurement period. This introduces artificial correlations
between t0 and Δt of individual cells, as, e.g. a cell with a
large t0 is expected to exhibit smaller than average Δt
values, because larger intervals are less likely to be
detected during the measurement period than in cells with
smaller values of t0. To remove these artificial correlations
between t0 and Δt of individual cells, in this assessment,
we only considered RNA productions for a certain
window size (Supplementary Methods and Supplementary
Figure S6). This window is set so as to maximize the
number of data points that can be extracted from the
measurements.

From the windowed data, we calculated the Pearson
correlation between t0 and Δt values in individual cells
to be -0.15. We calculated a P-value of 0.18 assuming
no correlation as the null hypothesis, which implies that
we cannot establish that the correlation is significant. This
result is in line with (12), which reports a lack of
correlation between initiation of protein expression and
subsequent rate of protein synthesis in individual cells.

Dynamics of induction and of transcription initiation 
under different induction schemes 

The distinctiveness of the distributions of t0 and Δt of
PBAD, as assessed by the K-S test, suggests that they
are, partially, the result of different processes. Although
t0 ought to depend on the kinetics of intake of arabinose
and on the first transcription initiation event, Δt values
ought to depend mostly on the kinetics of transcription
initiation events alone.

These assumptions arise from the following. First,
in vitro and in vivo measurements (26,38) suggest that
transcription initiation (including closed and open complex
formation) is a long-duration, multi-step process, usually
taking 10^2-10^3 s in bacterial promoters (10,25,26,37,38).
Other events that need to occur before a target RNA
becomes visible through MS2d-GFP tagging
are not expected to affect Δt significantly. These
are transcription elongation and the tagging by multiple
MS2d-GFP. Elongation of the target RNA was measured
to take only tens of seconds (31). Also, the tagging occurs
at a rate that makes the RNA visible during elongation or
shortly after (31).

To test the two assumptions, we measured the
distributions of t0 and Δt for another promoter, Plac/ara-1, in two
conditions. Plac/ara-1 can be induced either by IPTG or by
arabinose (as PBAD), or by both inducers simultaneously
(9). According to our assumption, the distribution of t0 of
PBAD is expected to be similar to that of Plac/ara-1 when the
latter is induced by arabinose, because both depend on
the same intake mechanism, whereas it should differ
significantly when Plac/ara-1 is induced by IPTG, given the
different intake mechanism of IPTG.

We measured the distributions of t0 and Δt for Plac/ara-1
when induced by IPTG alone and when induced by
arabinose alone (Table 1). We used the same concentration
of arabinose as when inducing PBAD. The IPTG
concentration used is the one required for maximum
induction of Plac/ara-1 (33). Results in Table 1 follow the
windowing procedure described earlier in the text. The
table shows mean, standard deviation and square of the
coefficient of variation (σ²/μ²) of t0 and of Δt for the two
promoters, each in two induction schemes.

We first assessed the distinctiveness of the distributions
of t0 and Δt by the K-S test, for each promoter in each
condition (Table 2). In all cases, these two distributions
differ in a statistical sense. This is in agreement with the
assumption that although both Δt and t0 depend on the
kinetics of initiation at the promoter, only t0 depends on
the kinetics of intake of inducers.

We next performed statistical tests to assess the
distinctiveness between the induction kinetics (t0) of the two
promoters (Table 3), when subject to the same inducer and
when subject to different inducers. Also, we compared the
effects of a different inducer concentration in the case of
PBAD. From Table 3, when PBAD and Plac/ara-1 are induced
with 1% arabinose, they exhibit distributions of t0 that
cannot be distinguished. However, when Plac/ara-1 is
Table 1. Measurements of t0 and Δt

Promoter    Inducer         No. of samples (Δt)  μΔt (s)  σΔt (s)  σ²/μ² (Δt)  No. of samples (t0)  μt0 (s)  σt0 (s)  σ²/μ² (t0)
PBAD        1% arabinose    102                  1440.6   532.8    0.14        84                   2885.0   1159.8   0.16
PBAD        0.1% arabinose  78                   1475.4   481.2    0.11        70                   3519.4   1236.2   0.12
Plac/ara-1  1% arabinose    149                  1516.5   516.0    0.12        125                  2832.5   1184.6   0.17
Plac/ara-1  1 mM IPTG       485                  1314.4   576.0    0.19        286                  2697.0   913.6    0.11

The table shows the mean (μ), the standard deviation (σ) and the normalized variance (σ²/μ²) of the measured distributions of t0 and Δt.



Table 2. P-values of the Kolmogorov-Smirnov test between t0 and
Δt distributions for each promoter and induction condition

Promoter    Inducer         P-value
PBAD        1% arabinose    2.83 x 10^-18
PBAD        0.1% arabinose  4.06 x 10^-21
Plac/ara-1  1% arabinose    2.48 x 10^-26
Plac/ara-1  1 mM IPTG       3.32 x 10^-72


For P < 0.05, it is generally accepted that the hypothesis that the two 
distributions are the same should be rejected. 



Table 3. P-values of the Kolmogorov-Smirnov test between t0
distributions for each promoter and induction condition

                     PBAD          PBAD           Plac/ara-1  Plac/ara-1
                     1% arab       0.1% arab      1% arab     IPTG
PBAD 1% arab         1
PBAD 0.1% arab       5.93 x 10^-4  1
Plac/ara-1 1% arab   0.8533        1.10 x 10^-4   1
Plac/ara-1 IPTG      0.0126        4.49 x 10^-12  0.0049      1


For P < 0.05, it is generally accepted that the hypothesis that the two 
distributions are the same should be rejected. 



induced with IPTG, the resulting t0 distribution is
statistically distinguishable from that of PBAD, when induced by
either 0.1 or 1% arabinose. It is also distinct from its own
t0 distribution when induced by 1% arabinose. This
statistically significant difference supports the hypothesis that
the distributions of t0 are dependent on the kinetics of the
intake system of the inducers, and that these differ for
IPTG and arabinose.

Finally, we observed that the distributions of t0 of PBAD,
when induced by 0.1% and by 1% arabinose, are distinct.
This is expected as the time for inducers to 'first reach the
promoter' ought to depend on the inducer's concentration.

Kinetics of the intake process 

The intake time of an inducer, here named 'tdiff', differs
from t0 in that it does not include the time for the first
transcription initiation event to occur. Because of this, tdiff
cannot be measured directly with the MS2d-GFP-tagging
method. We thus estimate the mean and variance of the
distribution of values of tdiff by subtracting the mean and
variance of the Δt distribution from those of the t0
distribution. This method is based on the fact that we were
unable to establish the existence of a correlation between
the values of t0 and Δt. Given this, and as they are, at most,
weakly correlated (Pearson correlation of -0.15), we assume
that they are independent so as to be able to estimate the
standard deviation of the duration of the intake process
alone (note that the mean of this quantity can be estimated
as described later in the text, regardless of the existence of
dependence).

The estimated mean and standard deviation of tdiff are
similar for PBAD and for Plac/ara-1, when induced with 1%
arabinose. Namely, in both cases, we obtained a mean of
~1400 s and a standard deviation of ~1100 s. This is
expected, given the usage of the same intake mechanism
and inducer concentration. Importantly, when Plac/ara-1 is
induced by IPTG, the standard deviation of tdiff is much
smaller (~700 s), whereas the mean is similar to when
induced by arabinose (~1400 s). This suggests that the
intake of arabinose is a noisier process (concerning the
uncertainty of the intake time) than the intake of IPTG.
Finally, we find that in the case of PBAD, the concentration
of arabinose affects the mean of tdiff significantly, as it
equals ~2000 s for 0.1% arabinose.
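
The subtraction of means and variances can be illustrated with the values in Table 1. The sketch below is a worked example under the independence assumption stated in the text, in which the variance of a sum of independent durations is the sum of their variances; the function and dictionary names are assumptions. The results agree with the rounded figures quoted above to within roughly 10%.

```python
import math


def intake_stats(mu_t0, sd_t0, mu_dt, sd_dt):
    """Mean and s.d. of the intake time, assuming t0 = intake + interval
    with independent terms (so that means and variances subtract)."""
    mean = mu_t0 - mu_dt
    var = sd_t0 ** 2 - sd_dt ** 2
    return mean, math.sqrt(var)


# (mean t0, s.d. t0, mean interval, s.d. interval), in seconds, from Table 1.
table1 = {
    ("PBAD", "1% arabinose"): (2885.0, 1159.8, 1440.6, 532.8),
    ("PBAD", "0.1% arabinose"): (3519.4, 1236.2, 1475.4, 481.2),
    ("Plac/ara-1", "1% arabinose"): (2832.5, 1184.6, 1516.5, 516.0),
    ("Plac/ara-1", "1 mM IPTG"): (2697.0, 913.6, 1314.4, 576.0),
}

t_diff = {key: intake_stats(*values) for key, values in table1.items()}

for (promoter, inducer), (mean, sd) in t_diff.items():
    print(f"{promoter:11s} {inducer:15s} mean = {mean:7.1f} s, s.d. = {sd:7.1f} s")
```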

Effect of the intake process on the temporal cell-to-cell 
diversity in RNA numbers 

Because it is stochastic and thus variable in duration
from one event to the next (i.e. it differs from one cell to
the next), the intake process affects the diversity in
RNA numbers of a cell population. This effect should
decrease with time, after induction. We estimated the time
during which the effect is tangible for each measurement
condition. For this, we assume that values of t0 depend
mostly on the intake of arabinose and on the first
transcription initiation event at the start site of PBAD.
Meanwhile, the distribution of intervals between
consecutive RNAs is assumed to depend solely on the kinetics of
transcription initiation (8,10,37).

The events determining Δt as well as t0 are modelled as
d-step processes, each step with an exponentially
distributed duration (Supplementary Methods) (37).
From this assumption, it is possible, for a given number
of steps, to find the duration of each step that best fits the
measurements. We assume transcription initiation to be a
three-step process, namely, the closed complex formation,
the open complex formation and promoter escape (27,38),
as evidence suggests that these are the most rate-limiting
steps in normal conditions, i.e. the ones most contributing
to the intervals between production of consecutive RNA
molecules (26). This assumption also relies on recent
Figure 3. Mean and Fano factor of transient times for different models of intake and subsequent RNA production kinetics. Mean (A) and Fano
factor (B) of RNA numbers as obtained by CME models of activation and expression. The models shown are that of Plac/ara-1 with 1 mM IPTG
(dashed line), Plac/ara-1 with 1% arabinose (dotted line), PBAD with 1% arabinose (dash-dotted line) and PBAD with 1% arabinose and infinitely fast
intake (solid line).



studies (37) that indicate that assuming this number of 
steps suffices to generate distributions that cannot be dis- 
tinguished, in a statistical sense, from measurements with 
accuracy and quantity of data similar to the measurements 
reported here. Finally, we assume the intake to be a 
two-step process, namely, the binding of extracellular 
arabinose to an uptake protein and, once bound, its trans- 
location to the cytoplasm (12). The combination of the 
two processes (intake followed by transcription initiation) 
is, consequently, assumed to be a five-step process. 
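
The multi-step assumption described above can be illustrated with a minimal sampling sketch. It draws Δt as a sum of three exponentially distributed steps and t0 as a sum of five (two intake steps followed by three initiation steps). The step durations, the function names and the sample size are hypothetical placeholders chosen for illustration; they are not the values fitted to the measurements.

```python
import random
import statistics


def sample_multistep(step_means, rng):
    """Draw one waiting time as the sum of independent exponential steps."""
    return sum(rng.expovariate(1.0 / mean) for mean in step_means)


def cv_squared(samples):
    """Squared coefficient of variation (variance divided by mean squared)."""
    mean = statistics.fmean(samples)
    return statistics.pvariance(samples) / (mean * mean)


# Hypothetical mean step durations in seconds (placeholders, not fitted values).
INTAKE_STEP_MEANS = [600.0, 900.0]             # binding, translocation
INITIATION_STEP_MEANS = [300.0, 400.0, 100.0]  # closed complex, open complex, escape

rng = random.Random(1)
n_samples = 20000

# Intervals between consecutive RNAs: three-step process.
dt_samples = [sample_multistep(INITIATION_STEP_MEANS, rng) for _ in range(n_samples)]
# Waiting time for the first RNA: five-step process (intake, then initiation).
t0_samples = [
    sample_multistep(INTAKE_STEP_MEANS + INITIATION_STEP_MEANS, rng)
    for _ in range(n_samples)
]

print("mean dt (s):", round(statistics.fmean(dt_samples), 1))
print("mean t0 (s):", round(statistics.fmean(t0_samples), 1))
print("CV^2 of dt :", round(cv_squared(dt_samples), 3))
```

With any choice of strictly positive step durations, the squared coefficient of variation of the sampled intervals falls below 1, which is the signature of a multi-step (sub-Poissonian) process.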

Assuming these numbers of steps and stable conditions
(e.g. induction level), we searched for models that fit the
distributions accurately enough that the K-S test does
not find differences between model and measurements.
The P-values of these tests are shown in Supplementary
Table S1 and show that, in all but one case, it is possible to
find a model that cannot be distinguished from the
empirical distribution, in a statistical sense.
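
As an illustration of this kind of comparison, the sketch below implements a two-sample K-S test in plain Python and applies it to a set of stand-in "measurements" and a large sample drawn from a candidate multi-step model. This is a simplified variant under stated assumptions: the two-sample form, the asymptotic P-value approximation, the function name `ks_two_sample` and all numerical values are choices made here for illustration, and may differ from the procedure in the Supplementary Methods.

```python
import math
import random


def ks_two_sample(a, b):
    """Two-sample K-S statistic D and an asymptotic P-value."""
    a, b = sorted(a), sorted(b)
    na, nb = len(a), len(b)
    i = j = 0
    d = 0.0
    # Walk both sorted samples and track the largest gap between the two ECDFs.
    while i < na and j < nb:
        x = min(a[i], b[j])
        while i < na and a[i] <= x:
            i += 1
        while j < nb and b[j] <= x:
            j += 1
        d = max(d, abs(i / na - j / nb))
    # Asymptotic Kolmogorov distribution with a small-sample correction.
    en = math.sqrt(na * nb / (na + nb))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:
        return d, 1.0  # the series converges slowly here; P is ~1 to many digits
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


rng = random.Random(7)


def draw(step_means):
    """One waiting time from a multi-step model with exponential steps."""
    return sum(rng.expovariate(1.0 / m) for m in step_means)


# Hypothetical step durations in seconds (placeholders, not fitted values).
steps = [600.0, 900.0, 300.0, 400.0, 100.0]
measured = [draw(steps) for _ in range(150)]    # stand-in for measured t0 values
candidate = [draw(steps) for _ in range(5000)]  # samples from the candidate model

d_stat, p_value = ks_two_sample(measured, candidate)
print("D =", round(d_stat, 3), " P =", round(p_value, 3))
```

A model would be retained when the P-value exceeds the chosen significance level, that is, when the test finds no difference between the model and the measured distribution.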

The case for which we could not find a model that fits
the measurements is that of PBAD at 0.1% arabinose
induction. This may be due to insufficient data or
because the model is unsuitable. Future studies are
required to ascertain this. One explanation may be that, in
this case, the distribution of intake times results from two
distinct kinetics, one being production under induction
and the other being spurious production by promoters
in the 'non-induced' state.

Given the aforementioned models and provided a rate
of RNA degradation, it is possible to estimate the time it
takes for the mean RNA numbers of a model cell population
to reach equilibrium, as this time depends solely on
the rate of degradation of RNAs and t0. We do not have
measurements of the degradation rate of the target RNA,
as the tagging with MS2d-GFP 'immortalizes' it for the
duration of the measurements (32). Instead, the models in
Figure 3 assume an RNA degradation rate of 5 min⁻¹,
which is within realistic intervals for E. coli (1).

From all of the aforementioned data, we estimated the
mean times for RNA numbers to reach near-equilibrium,
as well as the Fano factor of this quantity since the start of
the simulations. Results are shown in Figure 3, as
estimated for each of the models. Also shown is an
estimation that assumes the model of transcription initiation
of PBAD when induced by 1% arabinose, coupled with an
infinitely fast intake.

In all cases, reaching equilibrium in mean RNA
numbers takes >1 h, except when assuming infinitely fast
intake, in which case the time to reach equilibrium is <0.5 h.
Thus, for as long as 1-2 h after induction, the intake
process has a non-negligible contribution to the mean
and to the cell-to-cell diversity in RNA numbers of
the cell populations. From Figure 3A, one also observes
different shapes in the curves of Plac/ara-1 when induced by
IPTG (dashed line) and when induced by arabinose
(dotted line), because of differing intake kinetics.

From Figure 3B, the contribution of the intake kinetics
to the cell-to-cell variability in RNA numbers is also
significant. For example, the kinetics of intake causes an
increase in the Fano factor in the initial moments that is not
observable in the case of infinitely fast intake.
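
The transient behaviour of the mean and Fano factor can be explored with a minimal stochastic sketch of a model cell population. Each cell waits for a multi-step intake, then produces RNAs at intervals given by a multi-step initiation process, and each RNA decays independently at a fixed rate. This is a direct sampling scheme written for illustration, not the CME models used for Figure 3, and every name and numerical value below (step durations in minutes, degradation rate, number of cells) is a hypothetical placeholder rather than a fitted or reported value.

```python
import random
import statistics


def production_times(intake_means, init_means, t_end, rng):
    """Times (up to t_end) at which one model cell produces an RNA."""
    # Intake happens once; an empty list models infinitely fast intake.
    t = sum(rng.expovariate(1.0 / m) for m in intake_means)
    times = []
    while True:
        # Each RNA requires a full multi-step transcription initiation.
        t += sum(rng.expovariate(1.0 / m) for m in init_means)
        if t > t_end:
            return times
        times.append(t)


def rna_counts(prod_times, deg_rate, sample_times, rng):
    """Number of RNAs present in one cell at each sampling time."""
    decay_times = [tp + rng.expovariate(deg_rate) for tp in prod_times]
    return [sum(1 for tp, td in zip(prod_times, decay_times) if tp <= ts < td)
            for ts in sample_times]


def mean_and_fano(intake_means, init_means, deg_rate, sample_times, n_cells, seed):
    """(time, mean, Fano factor) of RNA numbers across a model population."""
    rng = random.Random(seed)
    t_end = max(sample_times)
    per_cell = [
        rna_counts(production_times(intake_means, init_means, t_end, rng),
                   deg_rate, sample_times, rng)
        for _ in range(n_cells)
    ]
    results = []
    for k, ts in enumerate(sample_times):
        column = [counts[k] for counts in per_cell]
        mean = statistics.fmean(column)
        fano = statistics.pvariance(column) / mean if mean > 0 else float("nan")
        results.append((ts, mean, fano))
    return results


# Hypothetical parameters (minutes and per-minute rates), for illustration only.
INIT_MEANS = [5.0, 5.0, 2.0]
SAMPLE_TIMES = [10.0, 30.0, 60.0, 120.0, 240.0]
with_intake = mean_and_fano([10.0, 15.0], INIT_MEANS, 0.05, SAMPLE_TIMES, 2000, seed=3)
fast_intake = mean_and_fano([], INIT_MEANS, 0.05, SAMPLE_TIMES, 2000, seed=3)

for (ts, m1, f1), (_, m2, f2) in zip(with_intake, fast_intake):
    print(f"t={ts:5.0f} min  intake: mean={m1:.2f} Fano={f1:.2f}"
          f"  fast: mean={m2:.2f} Fano={f2:.2f}")
```

Passing an empty list of intake steps reproduces the "infinitely fast intake" reference case, so the two outputs can be compared time point by time point.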

We also tested models of PBAD induced by 1% arabinose
(normal and infinitely fast intake) with other RNA
degradation rates (Supplementary Figure S7), within
realistic intervals (1). Aside from showing the degree of
dependency on the intake time and degradation rate,
the figure shows that although the
latter determines the rate at which the system reaches
equilibrium, the former acts as a delay towards reaching the
numbers at equilibrium. Further, one can see that
the intake step adds diversity to the RNA numbers in
the cells during the transient to reach equilibrium.

DISCUSSION 

We measured, at the single-cell level, how long it takes for
the first RNA under the control of PBAD to be produced,
following the introduction of the inducer into the media.
We also measured the subsequent intervals between
consecutive RNA productions. From the intervals
between transcription events, we determined that RNA
production under the control of PBAD is a sub-Poissonian
process. Two recent studies reached similar
conclusions for Plac/ara-1 and PtetA, for all induction
conditions tested (9,10). We hypothesize that this may be a
common phenomenon because of the kinetic properties of
the process of transcription initiation in bacteria, in
particular, because of its multi-stepped nature.
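
The link between a multi-stepped process and sub-Poissonian production can be made explicit with a short worked derivation. The notation is introduced here for illustration: $\tau_i$ denotes the mean duration of step $i$, $X_i$ its exponentially distributed duration, and $N(t)$ the number of RNAs produced up to time $t$ (with no degradation, as is the case for the tagged RNA). The steps are assumed independent.

```latex
\begin{align}
\Delta t &= \sum_{i=1}^{d} X_i, \qquad X_i \sim \mathrm{Exp}(1/\tau_i), \\
\mathrm{E}[\Delta t] &= \sum_{i=1}^{d} \tau_i, \qquad
\mathrm{Var}[\Delta t] = \sum_{i=1}^{d} \tau_i^{2}, \\
CV^{2} &= \frac{\sum_{i=1}^{d} \tau_i^{2}}{\left(\sum_{i=1}^{d} \tau_i\right)^{2}},
\qquad \frac{1}{d} \le CV^{2} \le 1, \\
\lim_{t \to \infty} \frac{\mathrm{Var}[N(t)]}{\mathrm{E}[N(t)]} &= CV^{2}.
\end{align}
```

The upper bound $CV^{2} = 1$ is reached only when a single step accounts for the whole interval (the Poissonian case), and the lower bound $1/d$ when all $d$ steps have equal mean duration. With two or more steps of non-zero duration, $CV^{2} < 1$, so the long-term Fano factor of the number of RNAs produced is below 1.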

From the distributions of the time it takes for the
appearance of the first RNA in each cell when under
the control of PBAD and of Plac/ara-1, for different induction
conditions, we assessed the effect of the kinetics of
the intake process on the mean and cell-to-cell diversity
in RNA numbers of cell populations. Notably, this
effect was found to be tangible for a long period after
induction. Also, we verified that different intake
mechanisms differ significantly not only in mean but also
in the degree of variability of the intake time, and
that this has a non-negligible effect on RNA population
statistics.

Given the aforementioned data, and considering that 
natural environments are fluctuating, we expect the 
kinetics of cellular intake mechanisms to have a significant 
effect on the degree of phenotypic diversity of cell popu- 
lations. Finally, we expect the methodology used here to 
assess the in vivo kinetics of intake of arabinose and of 
IPTG to be applicable to any gene of interest. Such studies 
should provide valuable insight into the adaptability of 
prokaryotic organisms to environmental changes and 
stress. They should also provide a better understanding 
of the observed cell-to-cell phenotypic diversity in E. coli 
when in fluctuating environments. 
Online: Supplementary Table 1, Supplementary Figures
1-7, Supplementary Methods and Supplementary